What if you could extract, organize, and visualize data from NASIS and many other commonly used soil database sources with a couple of lines of code?
The importance of pedon data for present and future work cannot be overstated. These data represent decades of on-the-ground observations of the soil resource for a given area. As difficult as it may be to take the time to enter legacy pedon data, it is vitally important that we capture this resource and get these data into NASIS as an archive of point observations.
Our confidence in observations commonly weakens with the depth of the material described.
For more information regarding difficult pedon data, see the following tutorial in the “aqp” package:
Dealing with Troublesome data.
The soilDB package for R works with soil-resource-related data sources. It has a series of convenience functions for accessing data in NASIS, KSSL, SDA, and other sources. The fetchNASIS convenience function extracts data from a NASIS selected set via Structured Query Language (SQL). Basic data checks are run within the fetch functions, then the data are assembled into a combined site-level and horizon-level data structure within a custom R object called a Soil Profile Collection (SPC). The SoilProfileCollection class simplifies the process of working with collections of data associated with soil profiles, e.g., site-level data, horizon-level data, spatial data, diagnostic horizon data, metadata, etc.. The SPC object inerits many of the familiar methods that operate on vectors and data.frame objects, such as nrow() (“how many horizons”) or length() (“how many pedons”).
Note that the import process in fetchNASIS() is not comprehensive. It does not pull all of the data for every table related to pedon data out of NASIS. Instead, it pulls much of the most commonly used pedon and horizon data. In addition, much of the nested complexity of the NASIS data structure is simplified in the resulting SPC object. Higher level functions like fetchNASIS() bundle a series of lower level functions that get specific parts of the data structure.
One-to-many relationships are flattened where possible by fetchNASIS(). This flattening aggregates the data into one site record with related horizon records. Selected additional data elements that may have a one-to-many relationship to a site or pedon can be gathered from a NASIS selected set via the get_extended_data_from_NASIS_db() function and then joined to the site-level data. More on this process in later…..
In short, the soilDB package greatly simplifies the process of getting pedon data from NASIS into R for further analysis.
After setting up an ODBC connection, you can use R to access data from a selected set defined in your local NASIS database. See this job aid: How to Create an ODBC Connection and Setup SoilDB for Use with R.
Query and load some pedon data into your NASIS selected set.
Does NASIS need to be open and running to query data using soilDB?
No, fetchNASIS() works whether the NASIS application is running or not. You just need to make sure that the data you want has been loaded into your selected set.
Take a moment to open the NASIS client and create a selected set with some site/pedon objects that will be used in the following sections. Using a query that includes both site and pedon objects, download to your local database and selected set sites with MT647 in the user site ID. Depending on the query, you will need to include wildcard characters like this: %MT647%.
fetchNASIS() FunctionWhen you load pedons using the fetchNASIS() function, the following data checks are performed:
Presence of multiple map datums. Results reported to the user and the data are not modified.
Inconsistent horizon boundaries. Pedons with inconsistent horizon boundaries are not loaded. In most cases, this occurs when the bottom depth of a horizon is not the same as the upper depth of the next lower horizon.
## hzname top bot
## 1 A 0 30
## 2 Bt1 38 56
## 3 Bt2 56 121
## 4 Bk 121 135
## 5 R 135
Note the issue above. The bottom depth of the A horizon and the upper depth of the Bt1 horizon should be the same: either 30 or 38 cm. The correct depth needs to be determined.
Missing lower horizon depths. Offending horizons are fixed by replacing the missing bottom depth with the top depth plus 2 cm. In the case of the profile shown above, a bottom depth of 137 cm would be inserted where the depth is missing.
Sites missing pedon records. Data without corresponding horizons are not loaded.
If errors in the pedon data are detected when loading data using fetchNASIS(), the following “get” functions can trace them back to the corresponding records in NASIS:
For more information on the design of soilDB functions, see the following documentation: Introduction to soilDB.
fetchNASIS() is a versatile function have has many arguments (i.e. options) that can be chosen.
fetchNASIS() will always load from the data in the selected set.fetchNASIS().For more information on the data checks and adjusting the default options to fetchNASIS() function, see the following resource: Tips on Getting Data from NASIS into R.
SoilProfileCollection (SPC) ObjectGopheridge Sample DatasetThe gopheridge sample dataset is very similar to the type of data returned from fetchNASIS(). The following demonstration shows the structure of the Soil Profile Collection (SPC) object that is returned by fetchNASIS().
Before proceeding, you may find it helpful to review the following: SoilProfileCollection Object Introduction. This tutorial provides an excellent overview of how the SPC object is constructed. Also, the manual pages for soilDB and aqp are accessible (click index at the bottom of the Help tab in RStudio) by entering the following into the R console:
# not run
library(soilDB)
help(soilDB)
# for links to lots of great examples look here!
library(aqp)
help(aqp)
Open RStudio, and set up the environment by loading packages and the Gopheridge sample dataset.
options(width=95, stringsAsFactors=FALSE)
library(soilDB)
library(aqp)
# load example dataset
data(gopheridge)
# what kind of object is this?
class(gopheridge)
## [1] "SoilProfileCollection"
## attr(,"package")
## [1] "aqp"
# what does the internal structure look like?
str(gopheridge, 2)
## Formal class 'SoilProfileCollection' [package "aqp"] with 11 slots
## ..@ idcol : chr "peiid"
## ..@ hzidcol : chr "phiid"
## ..@ hzdesgncol : chr "hzname"
## ..@ hztexclcol : chr "texcl"
## ..@ depthcols : chr [1:2] "hzdept" "hzdepb"
## ..@ metadata :'data.frame': 1 obs. of 2 variables:
## ..@ horizons :'data.frame': 317 obs. of 69 variables:
## ..@ site :'data.frame': 52 obs. of 88 variables:
## ..@ sp :Formal class 'SpatialPoints' [package "sp"] with 3 slots
## ..@ diagnostic :'data.frame': 162 obs. of 4 variables:
## ..@ restrictions:'data.frame': 56 obs. of 8 variables:
# let's take a look at the fields at the site and horizon levels within the SPC
siteNames(gopheridge)
## [1] "peiid" "pedon_id" "siteiid"
## [4] "site_id" "obs_date" "utmzone"
## [7] "utmeasting" "utmnorthing" "x"
## [10] "y" "horizdatnm" "x_std"
## [13] "y_std" "gpspositionalerror" "describer"
## [16] "pedonpurpose" "pedontype" "pedlabsampnum"
## [19] "labdatadescflag" "tsectstopnum" "tsectinterval"
## [22] "utransectid" "tsectkind" "tsectselmeth"
## [25] "elev_field" "slope_field" "aspect_field"
## [28] "plantassocnm" "earthcovkind1" "earthcovkind2"
## [31] "erocl" "bedrckdepth" "bedrckkind"
## [34] "bedrckhardness" "hillslopeprof" "geomslopeseg"
## [37] "shapeacross" "shapedown" "slopecomplex"
## [40] "drainagecl" "geomposhill" "geomposmntn"
## [43] "geomposflats" "slope_shape" "classdate"
## [46] "classifier" "classtype" "taxonname"
## [49] "localphase" "taxonkind" "seriesstatus"
## [52] "taxpartsize" "taxorder" "taxsuborder"
## [55] "taxgrtgroup" "taxsubgrp" "soiltaxedition"
## [58] "osdtypelocflag" "taxmoistcl" "taxtempregime"
## [61] "taxfamother" "psctopdepth" "pscbotdepth"
## [64] "selection_method" "ecositeid" "ecositenm"
## [67] "ecositecorrdate" "es_classifier" "es_selection_method"
## [70] "ochric.epipedon" "argillic.horizon" "lithic.contact"
## [73] "cambic.horizon" "paralithic.contact" "mollic.epipedon"
## [76] "paralithic.materials" "surface_fgravel" "surface_gravel"
## [79] "surface_cobbles" "surface_stones" "surface_boulders"
## [82] "surface_channers" "surface_flagstones" "surface_paragravel"
## [85] "surface_paracobbles" "landform_string" "pmkind"
## [88] "pmorigin"
horizonNames(gopheridge)
## [1] "peiid" "phiid" "hzname"
## [4] "genhz" "hzdept" "hzdepb"
## [7] "clay" "silt" "sand"
## [10] "fragvoltot" "texture" "texcl"
## [13] "lieutex" "phfield" "effclass"
## [16] "labsampnum" "rupresblkdry" "stickiness"
## [19] "plasticity" "ksatpedon" "texture_class"
## [22] "d_r" "d_g" "d_b"
## [25] "d_hue" "d_value" "d_chroma"
## [28] "d_sigma" "m_r" "m_g"
## [31] "m_b" "m_hue" "m_value"
## [34] "m_chroma" "m_sigma" "moist_soil_color"
## [37] "dry_soil_color" "fine_gravel" "gravel"
## [40] "cobbles" "stones" "boulders"
## [43] "channers" "flagstones" "parafine_gravel"
## [46] "paragravel" "paracobbles" "parastones"
## [49] "paraboulders" "parachanners" "paraflagstones"
## [52] "unspecified" "total_frags_pct_nopf" "total_frags_pct"
## [55] "art_fgr" "art_gr" "art_cb"
## [58] "art_st" "art_by" "art_ch"
## [61] "art_fl" "art_unspecified" "total_art_pct"
## [64] "huartvol_cohesive" "huartvol_penetrable" "huartvol_innocuous"
## [67] "huartvol_persistent" "soil_color" "hzID"
SoilProfileCollection objectThe plot() function applied to a SoilProfileCollection object generates sketches based on horizon depths, designations, and colors. The fetchNASIS() function automatically converts moist Munsell colors into R-style colors. Multiple colors per horizon are mixed. See ?plotSPC for a detailed list of arguments and examples.
par(mar=c(1,1,1,1))
# ommiting pedon IDs and horizon designations
plot(gopheridge, print.id=FALSE, name='', width=0.3)
title('Pedons from the `gopheridge` sample dataset', line=-0.5)
Additional documentation and examples can be found in:
Explore the site- and horizon-level data in your own SPC using the following code. Note: You must have pedons in your local NASIS selected set.
# load required libraries
library(soilDB)
library(aqp)
# load data from a NASIS selected set
pedons <- fetchNASIS(from = 'pedons')
# what kind of object is this?
class(pedons)
# how many pedons
length(pedons)
# let's take a look at the fields at the site and horizon levels within the SPC
siteNames(pedons)
horizonNames(pedons)
# look at the first 2 rows of site and horizon data
head(site(pedons), 2)
head(horizons(pedons), 2)
How can you find out how many site and horizon records are in the data you just loaded?
Quick check: Does the data plot roughly where you expect it?
Plotting the data directly as an R graphic can give you some idea of how the data look spatially and whether their distribution approximates what you expect. Typos are relatively common when coordinates are manually entered. Viewing the data spatially is a quick way to see if any points plot far outside of the geographic area of interest and therefore clearly have an error.
# plot the locations of the gopheridge pedons within R
# Steps:
# 1) subset to a new data frame
# 2) create a spatial points data frame (SPDF)
# 3) plot the data
# load libraries
library(sp)
library(mapview)
data("gopheridge")
# subset standard WGS84 decimal degree coordinates from the gopheridge SPC by specifying column names
gopher.locations <- site(gopheridge)
# initialize coordinates in an SPDF
coordinates(gopher.locations) <- ~ x_std + y_std
# define coordinate system
proj4string(gopher.locations) <- '+proj=longlat +datum=WGS84'
# creat interactive map
mapview(gopher.locations, legend=FALSE, map.types='OpenStreetMap', label=gopher.locations$site_id)
Google Earth is a powerful viewer for point data. Geographic data is displayed in Google Earth using the Keyhole Markup Language (KML) format. Using the plotKML package, you can easily create a KML file to inspect and view in Google Earth. See the related material in this tutorial: Export Pedons to Google Earth.
Another way you can view the data is to export a shapefile from R. For further information, see this tutorial: Export Pedons to Shapefile.
Use the script below to make an R plot of pedon data loaded from your NASIS selected set.
The following script plots the standard lat/long fields from NASIS. In some cases, these fields might be incomplete due to insufficient data or to not having been calculated from UTM coordinates in NASIS. In these cases, you can omit sites with “NA” values in the coordinates a couple of ways. The na.omit() or complete.cases() functions remove any rows in a dataframe that have “NA” values.
Run the following script on the data loaded from your local NASIS selected set:
# load libraries
library(aqp)
library(soilDB)
library(sp)
library(mapview)
# get pedons from the selected set
pedons <- fetchNASIS(from = 'pedons')
# subset standard WGS84 decimal degree coordinates from the
# gopheridge SPC by specifying column names
pedons.sp <- site(pedons)[, c('site_id', 'x_std', 'y_std')]
nrow(pedons.sp)
# remove any sites lacking standard lat/long coordinates
# notice that there may now be fewer rows of data
pedons.sp <- na.omit(pedons.sp)
nrow(pedons.sp)
# initialize coordinates in an SPDF
coordinates(pedons.sp) <- ~ x_std + y_std
# define coordinate system
proj4string(pedons.sp) <- '+proj=longlat +datum=WGS84'
# plot
mapview(pedons.sp, legend=FALSE, map.types='OpenStreetMap', label=pedons.sp$site_id)
Now that you’ve loaded some data, you can look at additional ways to summarize data elements and filter the SPC to specific sites of interest. The table() function is very useful for quick summary operations. This function can be combined with other functions, such as sort() and is.na() or !is.na() (is not NA). Follow along with your own data.
# summarize which soil taxa we have loaded
table(pedons$taxonname)
# sort results in descending order
sort(table(pedons$taxonname), decreasing=TRUE)
# could do the same thing for taxonomic subgroups or any column of the SPC at the site or horizon levels
table(pedons$taxsubgrp)
sort(table(pedons$taxsubgrp), decreasing=TRUE)
# table() is also useful when testing for null data using IS NA, is.na() or IS NOT NA, !is.na()
table(is.na(pedons$taxsubgrp))
table(!is.na(pedons$taxsubgrp))
# it can also be applied to horizon level columns in the SPC
sort(table(pedons$texture), decreasing=TRUE)
A variety of methods are available to subset a collection of soil profiles or an SPC. The results can then be placed into another SPC. This capacity can be useful for generating subset SPC objects from the original dataset.
%in% Equivalent to IN () in SQL. Can use c() to concatenate lists of vectors.
pedons$taxpartsize %in% c('loamy-skeletal', 'sandy-skeletal')!= Not-equal-to character “string.”== Note in the example above that R uses a double equal sign as “equal to.”<, >, <=, >= Less than, greater than, less than or equal to, and greater than or equal to.The following examples use the grep() function to pattern match within the data, create an index of the SPC for records that match the specified pattern within that column, and then use that index to filter to specific sites and their corresponding profiles. Patterns are specified in regular expression (REGEX) syntax.
This process can be applied to many different columns in the SPC based on how you need to filter the data. This example pattern matches on the tax_subgroup column, but another useful application might be to pattern match on geomorphology or parent material.
Note that the grep() below also has an invert option, which is specified as either true or false (the default is false). When set to true, this option is very useful for excluding the results of the pattern matching process by inverting the selection.
# say we wanted to look at what the variation of particle size classes are within a specific subgroup?
# use of grep() to filter and create an index, then apply that index to the SPC
# and create a new SPC called 'f1' using the square bracket notation
idx <- grep('lithic', pedons$taxsubgrp, invert=FALSE)
# save this subset of 'lithic' soils for later use
subset1 <- pedons[idx, ]
# or use the index directly to summarize a field
sort(table(pedons$taxpartsize[idx]), decreasing=TRUE)
Do a quick graphical check to ensure that the “lithic” profiles are selected. Plot them in R using the SoilProfileCollection “plot” method (e.g., specialized version of the generic plot() function).
# adjust margins
par(mar=c(1,0,0,1))
# plot the first 10 profiles of subset1
# limit plotting to a depth of about 60cm
plot(subset1[1:10, ], label='site_id', max.depth=60)
title('Pedons with the word "lithic" at subgroup-level of Soil Taxonomy', line=-2)
For more information on using regular expressions in grep() for pattern matching operations, see: Regular-expression-syntax.
| Equivalent to “or” in SQL.
grep('loamy | sandy', pedons$taxpartsize)^ Anchors to the left side of the string.
grep('^sandy', pedons$taxpartsize).$ Anchors to the right side of the string.
grep('skeletal$', pedons$taxpartsize).which() FunctionAnother method of subsetting a collection of soil profiles is to specify a criteria using the which() function. The following examples use the which() and grep() functions to reference the indexing of the SPC to create subsets and to filter for specific sites or their corresponding profiles.
# say we wanted to look at what the variation of particle size classes are within a specific subgroup?
# first: use grep to pattern match the tax_subgroup field for the string 'aqu'
idx <- grep('lithic', pedons$taxsubgrp)
# save this subset
subset1 <- pedons[idx, ]
# check taxonomic range of particle size classes in the data
sort(table(subset1$taxsubgrp), decreasing=TRUE)
sort(table(subset1$taxpartsize), decreasing=TRUE)
# then further query the subset for only those profiles with particle size class of 'sandy-skeletal'
# notice: a double equal sign '==' is used for exact character or numeric criteria
idx <- which(subset1$taxpartsize == 'loamy')
# save this subset
subset2 <- subset1[idx, ]
table(subset2$taxpartsize)
# plot profiles 1 thru 10
par(mar=c(0,0,2,1))
plot(subset2, label='site_id')
title('Loamy particle size control section class')
Soil Profile Collections are designed to be dismantled so they can work more easily with either site or horizon data. The SPC has a slot for site-level data and a slot for horizon-level data. You can reference these slots using the site() and horizons() functions within the AQP package. These “get” functions extract all the site or horizon variables as a dataframe for further use.
# extract site data from SPC into dataframe 's'
s <- site(pedons)
names(s)
# extract horizon data from SPC into dataframe 'h'
h <- horizons(pedons)
names(h)
You can also use these functions when referencing the data within an SPC to specify that you want to look specifically in the site or horizon data.
fetchNASIS()Now that you’ve loaded some data and learned a little about how to filter data in the SPC, you can quickly review some of the get() functions used to track data issues detected in the process of loading data back to the NASIS records in your selected set.
# use each one of these to return a vector of the pedons where errors were detected
#get('sites.missing.pedons', envir=soilDB.env)
#get('dup.pedon.ids', envir=soilDB.env)
#get('bad.pedon.ids', envir=soilDB.env)
# example of pedon_id's returned
#[1] "2011MT0810001" "2011MT0810009" "2011MT0810015" "2011MT0810027" "2011MT0810034"
#get('bad.horizons', envir=soilDB.env)
# How could you then remove these from your SPC?
# since the get() returns the string of bad pedon id's we can use a which() to query any pedon id's that don't match the bad id's
idx <- which(! pedons$pedon_id %in% get('bad.pedon.ids', envir=soilDB.env))
pedons <- pedons[idx, ]
Another useful function is dput(), which concatenates a variable. It converts something like this:
“2011MT0810001” “2011MT0810009” “2011MT0810015” “2011MT0810027” “2011MT0810034”
into a comma delimited string like this:
c(“2011MT0810001”, “2011MT0810009”, “2011MT0810015”, “2011MT0810027”, “2011MT0810034”).
Such a string can be copied and then pasted back as a concatenated string or could even be used as string for NASIS list queries. The dput() function is also helpful when sending questions or examples to colleagues via email.
Additional data related to both site and horizon information can be fetched using the get_extended_data_from_NASIS() function. This data is not automatically brought into R because these data elements are typically related to the site or horizon data in one-to-many relationships. For example, multiple diagnostic features could exist within one pedon. Below is a summary of additional information that can be readily brought into R from your NASIS selected set via the get_extended_data_from_NASIS() function.
# fetch extended site and horizon data
e <- get_extended_data_from_NASIS_db()
### site and pedon related extended data
# vegetation data summary
colnames(e$ecositehistory)
# diagnostic features
colnames(e$diagnostic)
# surface rock fragments
colnames(e$surf_frag_summary)
# geomorphic description
colnames(e$geomorph)
# taxonomic history data
colnames(e$taxhistory)
# linked photo stored in site textnotes
colnames(e$photo)
# site parent materials
colnames(e$pm)
### horizon related extended data
# rock fragments
colnames(e$frag_summary)
# soil texture modifers
colnames(e$texmodifier)
# soil structure data
colnames(e$struct)
The “Geomorphic description” and “parent materials” attributes are important soil data. They can be useful handles for exploring other data. The soilDB package flattens the nested table structure of parent material and geomorphic description within NASIS into single strings for each site-level record. The pattern matching concepts demonstrated above select profiles based on parts of these strings. The following code generates a simple graphical summary of the 10 most commonly occurring landforms in fetchNASIS() data so you can see their frequency of occurrence.
# graphically tabulate the occurrence of landforms
# load required libraries
library(soilDB)
# required for dotchart2()
library(Hmisc)
# load data from a NASIS selected set
pedons <- fetchNASIS(from = 'pedons')
# create 'lf' object of landform factors sorted in descending order
lf <- sort(table(pedons$landform_string), decreasing = TRUE)
# plot top 10 or length, whichever is shorter
dotchart2(lf[1:pmin(10, length(lf))], col='black', xlim = c(0, max(lf)), cex.labels = 0.75)
If diagnostic features are populated in the pedon diagnostic features table in NASIS, then Boolean (TRUE or FALSE) fields are created for each diagnostic feature in the data brought into R by soilDB. These fields can be readily used to model the presence or absence of a diagnostic soil feature by extracting the site data.
You could use the following code to pull the upper depth to calcium carbonates using the “calcic horizon” and/or the “secondary carbonates” diagnostic features. However, data consistency is critical. You can only use those fields if they are consistently populated for all pedons that you are working with in your selected set. As you start working with larger pedon data sets, you will quickly find that there can be great inconsistencies in the way the data were populated by different people in different offices on different surveys over different time frames.
The following is an example of how you could use the diagnostic features (if populated!) from the extended data to determine the thickness of a diagnostic feature of interest.
# rename gopheridge data
data("gopheridge")
f <- gopheridge
# get diagnostic features associated with pedons loaded from selected set
d <- diagnostic_hz(f)
# summary of the diagnostic features in your data!
unique(d$featkind)
## [1] ochric epipedon argillic horizon lithic contact cambic horizon
## [5] paralithic contact mollic epipedon paralithic materials <NA>
## 84 Levels: anthropic epipedon abrupt textural change andic soil properties ... manufactured layer contact
sort(table(droplevels(factor(d$featkind))), decreasing = TRUE)
##
## ochric epipedon argillic horizon lithic contact paralithic contact
## 51 44 33 19
## cambic horizon mollic epipedon paralithic materials
## 12 1 1
# subset argillic horizons
d <- d[d$featkind == 'argillic horizon', ]
# create a new column and subtract the upper from the lower depth
d$argillic_thickness_cm <- d$featdepb - d$featdept
# create another new column with the upper depth to the diagnostic feature
d$depth_to_argillic_cm <- d$featdept
# omit NA values
d <- na.omit(d)
# subset to pedon records IDs and calculated thickness
d <- d[, c('peiid', 'argillic_thickness_cm', 'depth_to_argillic_cm')]
head(d)
## peiid argillic_thickness_cm depth_to_argillic_cm
## 2 242808 63 18
## 5 252851 50 18
## 7 268791 43 15
## 10 268793 71 10
## 13 268794 50 5
## 16 268795 46 15
# join these data with existing site data
site(f) <- d
# plot as histogram
# reset figure margins
par(mar = c(4.5,4.5,1,1))
hist(f$argillic_thickness_cm, xlab = 'Thickness of argillic diagnostic (cm)', main='')
hist(f$depth_to_argillic_cm, xlab = 'Depth to argillic diagnostic (cm)', main = '')
# start fresh with your own data
f <- fetchNASIS(from = 'pedons')
# get diagnostic features associated with pedons loaded from selected set
d <- diagnostic_hz(f)
# summary of the diagnostic features in your data!
unique(d$featkind)
# top 5 most frequent
sort(table(d$featkind), decreasing = TRUE)[1:5]
# subset argillic horizons - or choose your own diagnostic feature and modify this script!
#idx <- which(d$diag_kind == 'your_diagnostic')
#d <- d[idx, ]
# how would you do the rest.....see if you can work it out!
What can you do with the Boolean diagnostic feature data?
## work up diagnostic plot based on gopheridge dataset
library(aqp)
library(soilDB)
library(sharpshootR)
# load data
data(gopheridge)
# can limit which diagnostic features to show by setting 'v' manually
v <- c('ochric.epipedon', 'cambic.horizon', 'argillic.horizon', 'paralithic.contact', 'lithic.contact')
# generate diagnostic property diagram
diagnosticPropertyPlot(gopheridge, v, k=5, grid.label='site_id', dend.label = 'taxonname', sort.vars = FALSE)
# plot again, this time with diagnostic features ordered according to co-occurrence
diagnosticPropertyPlot(gopheridge, v, k=5, grid.label='site_id', dend.label = 'taxonname', sort.vars = TRUE)
Use the following script to generate a diagnostic-feature diagram for the pedon data you’ve loaded from your NASIS selected set. Note: If the data includes more than about 20 pedons, the script might generate figures that are very hard to read. You also need to be certain that pedon diagnostic feature were populated in your data.
library(soilDB)
library(sharpshootR)
# load data
f <- fetchNASIS(from = 'pedons')
# may need to subset to a particular series or taxa here....to reduce the number of pedons!
# select a series of diagnostic properties or automatically pull diagnostic feature columns
# get all diagnostic feature columns from site data by pattern matching on '[.]' in the colnames
idx <- grep('[.]', colnames(site(f)))
v <- colnames(site(f))[idx]
v
# or insert diagnostics of interest found in your data here from the list of possible diagnostics in 'v'
v <- c('ochric.epipedon', 'cambic.horizon', 'argillic.horizon', 'paralithic.contact', 'lithic.contact')
# generate diagnostic property diagram
diagnosticPropertyPlot(f, v, k=5, grid.label='site_id', dend.label = 'taxonname')
For more information on generating diagnostic feature diagrams, see the following tutorial: Diagnostic Feature Property Plots.
There are times when it is necessary to send customized queries to the local NASIS database. Queries are written in T-SQL which is the dialect of SQL used to communicate with Microsoft SQL Servers (e.g. the local NASIS database). The fetchNASIS and related convenience functions are essentially wrappers around commonly used chunks of SQL.
The following example will return all records from the sitesoiltemp table, along with a couple of fields from the site, siteobs, and pedon tables. This is a convenient way to collect all of the field-based soil temperature data associated with the pedons in your selected set for further analysis.
library(soilDB)
library(RODBC)
# write query as a long text object
q <- "
-- columns to return
SELECT siteiid as siteiid, peiid, usiteid as site_id, upedonid as pedon_id, obsdate as obs_date,
soitemp, soitempdep
FROM
-- tables that are queried and join conditions
site_View_1
INNER JOIN siteobs_View_1 ON site_View_1.siteiid = siteobs_View_1.siteiidref
LEFT OUTER JOIN sitesoiltemp_View_1 ON siteobs_View_1.siteobsiid = sitesoiltemp_View_1.siteobsiidref
LEFT OUTER JOIN pedon_View_1 ON siteobs_View_1.siteobsiid = pedon_View_1.siteobsiidref
-- ordering of rows
ORDER BY obs_date, siteiid;"
# setup connection local NASIS
channel <- odbcDriverConnect(connection = getOption("soilDB.NASIS.credentials"))
# exec query
d <- sqlQuery(channel, q, stringsAsFactors=FALSE)
# close connection
odbcClose(channel)
# check results
str(d)
# remove records missing values
d <- na.omit(d)
# tabulate unique soil depths
table(d$soitempdep)
# extract doy of year
d$doy <- as.integer(format(d$obs_date, "%j"))
# when where measurements collected?
hist(d$doy, xlim=c(1,366), breaks=30, las=1, main='Soil Temperature Measurements', xlab='Day of Year')
# soil temperature by day of year
plot(soitemp ~ doy, data=d, type='p', xlim=c(1, 366), ylim=c(-1, 25), xlab='Day of Year', ylab='Soil Temperature at 50cm (deg C)', las=1)
This document is based on aqp version 1.19, soilDB version 2.5.1, and sharpshootR version 1.6.